Uncertain Groupings: Probabilistic Combination of Grouping Data

Authors

  • Brend Wanders
  • Maurice van Keulen
  • Paul E. van der Vet

Abstract

Probabilistic approaches for data integration have much potential [7]. We view data integration as an iterative process where data understanding gradually increases as the data scientist continuously refines his view on how to deal with learned intricacies like data conflicts. This paper presents a probabilistic approach for integrating data on groupings. We focus on a bio-informatics use case concerning homology. A bio-informatician has a large number of homology data sources to choose from. To enable querying combined knowledge contained in these sources, they need to be integrated. We validate our approach by integrating three real-world biological databases on homology in three iterations.
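
To make the idea of an uncertain grouping concrete, here is a minimal sketch. It is not the method of the paper, whose details are not given in this abstract. It assumes each source is trusted independently with some probability, that groups within one source do not overlap, and that two items belong together if at least one trusted source groups them together. The function name `pair_probabilities`, the trust values, and the gene names are all hypothetical.

```python
from itertools import combinations


def pair_probabilities(sources):
    """Return P(two items share a group) for every pair asserted by a source.

    sources: list of (trust, groups), where trust is an assumed probability
    that the source is correct and groups is a list of disjoint sets.
    Assumption: sources are independent, so a pair is grouped unless
    every source asserting it is wrong.
    """
    all_wrong = {}  # pair -> probability that every asserting source is wrong
    for trust, groups in sources:
        for group in groups:
            for a, b in combinations(sorted(group), 2):
                all_wrong[(a, b)] = all_wrong.get((a, b), 1.0) * (1.0 - trust)
    return {pair: 1.0 - q for pair, q in all_wrong.items()}


# Three hypothetical homology sources with made-up trust values.
sources = [
    (0.9, [{"geneA", "geneB"}, {"geneC"}]),
    (0.6, [{"geneA", "geneB", "geneC"}]),
    (0.5, [{"geneB", "geneC"}]),
]
probs = pair_probabilities(sources)
for pair, p in sorted(probs.items()):
    print(pair, round(p, 2))
```

Under these assumptions, a pair supported by several sources ends up more probable than a pair asserted by only one, which is the kind of combined knowledge a query over the integrated data could draw on.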

Similar resources

A Probabilistic Approach to Perceptual Grouping

distributions over the space of possible image feature groupings. The framework can be used to find several of the most probable partitions of image features into groupings, rather than just returning a single partition of the features as do most fea... for higher level processing [17, 34]. Less progress has been made in determining how features should be organized into more abstract structures to ...

Risk factor groupings related to insulin resistance and their synergistic effects on subclinical atherosclerosis: the atherosclerosis risk in communities study.

The extent to which groupings of insulin resistance-related cardiovascular risk factors synergize to produce atherosclerosis beyond what is expected from their additive effects is uncertain. The objective of this study was to measure interactions among groupings of the features of the insulin resistance syndrome (IRS) on carotid intimal-medial thickness (IMT). This cross-sectional study used ba...

Historical limitations of determinant based exposure groupings in the rubber manufacturing industry.

AIMS To study the validity of using a cross-sectional industry-wide exposure survey to develop exposure groupings for epidemiological purposes that extend beyond the time period in which the exposure data were collected. METHODS Exposure determinants were used to group workers into high, medium, and low exposure groups. The contrast of this grouping and other commonly used grouping schemes ba...

A Grouping Method for Categorical Attributes Having Very Large Number of Values

In supervised machine learning, the partitioning of the values (also called grouping) of a categorical attribute aims at constructing a new synthetic attribute which keeps the information of the initial attribute and reduces the number of its values. In case of very large number of values, the risk of overfitting the data increases sharply and building good groupings becomes difficult. In this ...

Designing a learning approach to community-based decision-making: Using analytical statistics and decision trees to optimize a data structure

This paper describes an effort to predict alumni success as measured by employer satisfaction in a veterinary medical education environment. Due to the complexity of the data, this paper sought to augment regression analysis with decision tree analysis to see if the combination of both approaches could result in new insights. Potential predictors of employer satisfaction were characterized as e...

Publication date: 2015